Production in yeasts of stable antibody fragments

ABSTRACT

Associative portions of antibody light and heavy chains, especially Fv fragments, are expressed in a transformed organism as a single peptide chain connected by a linking peptide. This is cleaved, possibly while peptide synthesis is incomplete, by an enzyme of the transformed organism which is expressing the single peptide. This ensures production of both chains in equal amounts, but produces them as separate chains which are free to associate and fold.

This invention relates to the production of antibody fragments and analogous entities. In recent years there has been considerable interest in the production of antibody fragments. One fragment of particular interest is the so-called Fv fragment which consists of the variable domain of the light chain of an antibody and the associated variable domain of the antibody's heavy chain. The inter-molecular forces which bring about the association of these domains in whole antibodies also hold them together in an Fv fragment.

The individual domains of an Fv fragment can each be expressed by a genetically transformed organism. Once they are present together in solution they will spontaneously associate to give the desired Fv fragment.

The expression of single antibody domains by a transformed microorganism is discussed at length in European Patent 368684 (Medical Research Council).

If the required variable domains from the light and heavy chains are present together in solution they will in general spontaneously associate together into the required Fv fragments; association is particularly favoured by a high concentration of the domains. It is not however straightforward to get both of the variable domains produced and in solution together in equal amounts and in appropriate concentrations which favour association. One possibility is to express them both by means of separate host organisms and then bring them together. This, however, requires the transformation of two host organisms and the amounts of the variable domains which are expressed may not match each other. Moreover, it is generally thought that the expression of the heavy chain without the light chain is often harmful to the cells which express it, making it difficult to obtain concentrations suitable for industrial production.

Another possibility is to transform a host organism so that a single transformed organism contains genetic information coding for both variable domains. This can be achieved by assembling the genetic information coding for both domains on a single vector as disclosed by Reichmann et al, J. Mol. Biol. 203 825 (1988). However, in this case the host organism may not express the two variable domains in equal amounts, thereby wasting cellular metabolism in unproductive synthesis, and again risking harm to the cells from the surplus of one chain.

A way around the difficulties is proposed in European Patent 281604B which discloses the production of a single polypeptide containing the binding portion of each variable domain along with a linking peptide sequence which joins them together. This linking peptide sequence is designed so that after the single polypeptide has been expressed, the binding portions of the two variable domains can associate together to form a molecule analogous to an Fv fragment which is the so-called single chain Fv fragment. European Patent 281604B brings two key advantages. First, the two domains are produced in equal quantities. Secondly, the two domains are produced at high "local" concentration, since they are linked, and therefore association is strongly favoured.

It is explained in this prior document that the design of this single polypeptide molecule necessitates some compromise. It is taught that the linking region should extend from the C-terminal region of the light chain to the N-terminal region of the heavy chain. However it should not join the extremities of these terminal regions because they are relatively far apart in a natural antibody (and likewise in a complete Fv fragment).

Instead the peptide link should extend from a point spaced somewhat inwardly from the C-terminal of the light chain to a point spaced somewhat inwardly of the N-terminal of the heavy chain, these being points which are somewhat closer together in the natural antibody. The consequence of this is that a portion of the light chain adjacent its C-terminus and a portion of the heavy chain adjacent its N-terminus are not expressed and instead are replaced by the linking peptide. Even so the peptide link must be designed with some care so that it is of sufficient length to permit the two variable domains to fold and associate together.

EP-A-623679 which is a divisional out of EP-A-318554 also discloses the expression of a single polypeptide containing the binding portion of each variable domain along with a linking peptide sequence joining them together. The document teaches that the link should have a length of at least 10 amino acids, and mentions that the linking peptide could include a cleavage site recognizable by a site specific cleavage agent. It is stated that this could allow the V_(H) and V_(L) domains to be separated later, or the linker to be excised after folding at the binding site. The document does not elucidate how such a construct would be processed or utilised. However, it suggests that linking the V_(H) and V_(L) domains together may do little or no harm to, and may even improve, binding properties.

Although the genetic constructs encoding single-chain Fv have some clear advantages in the production of antibody fragments, the resultant single-chain Fv protein is disappointing in its performance compared with the Fv fragments which have no link between the light and heavy chains. This is illustrated below by comparing the stability of the two different protein structures, stability being a very important performance criterion for industrial applications. Although single-chain Fv fragments are more stable than ordinary (two-chain) Fv fragments when subject to prolonged storage at 37° C., we have observed that they are completely inactivated by some biophysical shocks such as a series of freeze/thaw cycles. In contrast, true (two-chain) Fv can survive freeze/thaw cycles with very little activity loss. A possible explanation for this extremely surprising discovery is that the single-chain Fv is intrinsically more resistant to denaturation due to the covalent coupling of its two component domains (via the linker), and therefore it can survive longer periods of storage at 37° C. than two-chain Fv; however, if the single-chain Fv is completely denatured by a biophysical shock, the presence of the linker interferes with the refolding process which would otherwise take place on a return to physiological conditions. In contrast, two-chain Fv can survive biophysical shocks by being denatured and then refolding successfully. Whatever the explanation, antibody fragments intended for industrial applications should be robust enough to cope with perturbations likely to denature them such as elevated temperatures, freeze/thaw, or extremes of pH. A biophysical shock sufficient to denature Fv may for instance be used to remove the Fv from an affinity column during purification. Consequently, the ability to refold spontaneously can be valuable. Also, single-chain Fv fragments have a tendency to give more non-specific binding, which is usually not desirable.

In the present invention, associative portions of a heavy chain and a light chain, e.g. the binding regions of their variable domains, are also expressed as parts of a single polypeptide in which they are connected through a linking peptide sequence. However, this connection incorporates a site for cleavage by an enzyme produced by the transformed organism which is expressing the polypeptide. After or during expression of the single polypeptide it is cut at the cleavage site while still within the culture where it has been expressed, thereby detaching the portions of the heavy chain and the light chain from each other and allowing them to associate spontaneously together.

Thus the present invention provides a method of preparing an Fv antibody fragment or other product which incorporates associative portions of an antibody's light and heavy chains, by:

connecting nucleotide sequences which code for the portions of the two chains, by means of an additional nucleotide sequence which is interposed between them and which codes for a linking peptide sequence;

transforming a host organism to incorporate the connected nucleotide sequences;

culturing the transformed organism to express a polypeptide which contains the portions of the light and heavy chains, joined by a linking peptide sequence coded by the said additional nucleotide sequence;

characterised by inclusion of a cleavage site in the linking peptide sequence such that the linking sequence is cut enzymatically by an enzyme produced by the transformed organism.

It should thus be appreciated that in this invention the transformed organism synthesizes two peptide chains, both of which are desired, and both of which go into the final product, but initially they are joined, and are separated from each other by the organism after synthesis of the joint between them. The expressed, single polypeptide may exist for a short period as a transition state, or it is possible that cleavage will occur during the synthesis of its second chain. This can avoid difficulty if this single, expressed polypeptide would otherwise be toxic to the host organism.

Possibly the enzyme which carries out cleavage could be an enzyme occurring in the membrane of the transformed organism or even an extracellular enzyme that has been produced by the organism. In this event cleavage of the linking peptide would take place as the protein is excreted through the membrane or in the surrounding culture medium.

This method generally leads to a product in which at least one and probably both of the two portions of antibody chains is prolonged by a remnant of the linking peptide, although the remnant may be very small. It would be possible (although probably inconvenient) to design a linking peptide which is cut away completely.

This invention is particularly envisaged for the production of Fv fragments. The associative portions of the two chains will be their variable domains or at least the binding regions thereof. The product will then be an Fv antibody fragment in which at least one and probably both of the variable domains is attached to a remnant of the linking peptide.

Because the peptide sequence which provides a link between the heavy chain variable domain and the light chain variable domain is cut after expression of the single polypeptide, there is greater freedom of choice in choosing the length of the linking peptide between them. Moreover there is no necessity to omit terminal portions of the desired variable domains for the sake of reducing the length of the link between them.

Although there is no need to omit terminal portions of the desired variable domains, this could nevertheless be done if desired. Generally the nucleotide sequences will code for at least the binding regions of antibody variable domains.

In one form of this invention, the link between the antibody chains is sufficiently short, e.g. less than 10 amino acids, that the two chains cannot associate together until the link is cut. The result of this is that (folded) single chain Fv is not produced as a transient product.

Nucleotide sequences which code for light and heavy chain variable domains can be obtained by cloning of existing genetic material. This can be done by means of the polymerase chain reaction (PCR) which is well known in the field of biotechnology. Literature references for this technique are:

Saiki et al, Science 230 1350 (1985)

Scharf et al, Science 233 1076 (1986)

and

Saiki et al, Science 239 487 (1988)

Its application to the cloning of variable domains has been described by Orlandi, Winter et al, PNAS USA 86 3833 (1989) and in EP-A-368684.

The essence of the PCR technique is repeatedly carrying out a cycle of steps comprising:

exposing a required nucleotide sequence in a nucleic acid strand,

annealing a primer oligonucleotide adjacent an end of the required sequence, and

synthesising a complementary nucleic acid strand extending from the primer,

these steps being carried out utilising a primer able to anneal to one nucleic acid strand adjacent to one end of the required sequence and a second primer able to anneal to the complementary nucleic acid strand adjacent the opposite end of the required sequence, thereby to produce clone strands of nucleic acid which are the required sequence with end-portions determined by the two primers.
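The way the two primers determine the ends of the cloned sequence can be sketched in a few lines of Python. This is only an illustrative model: the template, flanking sequences and primers below are hypothetical and are not the primers of EP-A-368684, and the function simply locates exact primer matches rather than simulating annealing or extension.

```python
# Minimal in-silico PCR sketch (hypothetical sequences, exact matching only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def pcr_product(template, forward_primer, reverse_primer):
    """Return the sense-strand region bounded by the two primers.

    forward_primer matches the sense strand at one end of the required
    sequence; reverse_primer anneals to the sense strand at the opposite
    end, so its reverse complement is what appears in the sense strand.
    """
    start = template.find(forward_primer)
    end_site = template.find(reverse_complement(reverse_primer))
    if start < 0 or end_site < start:
        raise ValueError("primers do not bracket a region of the template")
    return template[start:end_site + len(reverse_primer)]


# Hypothetical template: a required sequence between two flanking regions.
flank_5 = "TTGACCATGGCC"
required = "CAGGTGCAGCTGCAGGAGTCAGGACCTGGTCACCGTCTCCTCA"
flank_3 = "GGTAA"
template = flank_5 + required + flank_3

forward = "CAGGTGCAGCTG"                         # start of the required sequence
reverse = reverse_complement("GTCACCGTCTCCTCA")  # anneals at its far end

product = pcr_product(template, forward, reverse)
print(product == required)
```

In this idealised picture each cycle can at most double the number of copies, so n cycles give at most 2**n copies of the bracketed sequence per starting template.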

A linking nucleotide sequence coding for the linking peptide can be made directly by oligonucleotide synthesis.

Assembly of the three nucleotide sequences to form an in-frame single nucleotide sequence can be carried out with standard techniques of recombinant DNA technology. In order to facilitate this it is preferred that the primers used in the PCR reaction provide restriction sites and that the linking nucleotide sequence also incorporate restriction sites.

Standard techniques of recombinant DNA technology can be used to transform a host organism with the nucleotide sequence.

A possibility is to transform the genetic material of the host organism so that it also expresses the protease enzyme which will recognise the cleavage site in the linking peptide.

In a development of this invention, the linking peptide is designed such that it incorporates a sequence of amino acids which can be used for recognition of the Fv fragments during assay and/or purification. Recognition of such a sequence would be utilised after cleavage, at which stage the recognition sequence should be present as a remnant of the linking peptide, attached to one chain of the Fv fragment.

In another development of this invention, the linking peptide sequence is designed such that it incorporates two cleavage sites. When the peptide sequence is cut at these sites, part of it is cut right out. Consequently the antibody fragment which is formed may carry only very small remnants of the linking peptide sequence, or no remnants at all.

Yeasts are presently preferred as organisms to be transformed and used to express the peptide chain, which is then cut. In particular methylotrophic yeasts may be used, notably Pichia pastoris and the closely related Hansenula polymorpha.

The preferred proteases to cleave the linking peptide are of the KEX2-type. Enzymes of this group of proteases are found in many organisms. They recognise, and cleave adjacent to, a specific cleavage site of . . . Lys-Arg . . . provided that these are in an exposed position rather than concealed by folding of the peptide chain. Cleavage takes place next to arginine so that

    . . . Lys-Arg-X- . . .

is cleaved to

    . . . Lys-Arg-COOH + H2N-X . . .

where X denotes any amino acid, and Arg-COOH indicates that after cleavage the arginine residue is at the C-terminus.
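The cleavage rule just stated can be expressed as a short Python sketch. It is a simplified model only: it cuts after every Lys-Arg pair and ignores the requirement that the site be exposed rather than concealed by folding. The example chain is a hypothetical illustration.

```python
def kex2_cleave(residues):
    """Split a chain (list of three-letter residue codes) after each Lys-Arg.

    Simplified model: every Lys-Arg pair is treated as exposed and cleavable.
    """
    fragments, current = [], []
    for i, residue in enumerate(residues):
        current.append(residue)
        # Cut on the C-terminal side of Arg when it directly follows Lys.
        if residue == "Arg" and i > 0 and residues[i - 1] == "Lys":
            fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    return fragments


# Hypothetical chain of the form ...Lys-Arg-X...
chain = ["Gly", "Ala", "Lys", "Arg", "Ser", "Thr"]
for fragment in kex2_cleave(chain):
    print("-".join(fragment))
```

The first fragment ends in Arg, corresponding to Arg-COOH above, and the second begins with the residue denoted X.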

Expression and cleavage into two peptide chains within the transformed organism will generally be followed by a step of harvesting or recovery in which the desired Fv fragments are separated from other constituents of the composition in which they have been formed. Notably, they will desirably be separated from the enzyme which brought about the step of cleavage. Techniques for the harvesting of biological molecules, such as polypeptides, are well known. Affinity chromatography is widely used.

The invention could be applied to the production of bodies in which the portions of the two chains are prolonged with further polypeptides. These could include the variable domains of the chains of a second antibody, thus leading to a product with two specific binding affinities, analogous to the diabodies described in Holliger et al PNAS 90 6444 (1993). Once again there would be the advantage that the required two polypeptides which associate together are made in equal amounts because they are expressed as a single polypeptide. In this case the two polypeptides which are linked should respectively contain at least one light chain and at least one heavy chain which will associate with it.

When this invention is applied to the production of diabodies, the linking peptide sequence which connects the polypeptides (and is eventually cut) may again be sufficiently short that association of the polypeptides does not take place until the link has been cut.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will be described by way of example, with reference to the accompanying diagrammatic drawings in which FIGS. 1 to 6 provide an illustration of the invention in principle, FIGS. 7 to 12 illustrate an example of the invention and FIGS. 13 to 17 illustrate the application of the invention to diabodies. More specifically:

FIGS. 1 and 2 illustrate DNA sequences obtained by cloning;

FIG. 3 shows a sense strand of DNA, (Seq ID No. 1) made synthetically;

FIG. 4 shows a longer DNA sequence (Seq ID No. 1) obtained by ligating the sequences of FIGS. 1 to 3;

FIG. 5 shows the polypeptide coded by the DNA sequence (Seq ID No. 2) of FIG. 4;

FIG. 6 shows an Fv fragment obtained from the polypeptide of FIG. 5;

FIG. 7 is a schematic representation of the construction of FvKC-II-KEX2 and Yeast-FvKC-II-KEX2 starting from FvKC-II.

FIG. 8 shows the nucleotide sequence (Seq ID No. 4) and corresponding amino acid sequence (Seq ID No. 5) of the FvKC-II genes on the HindIII-EcoRI fragment in pUC19.

FIG. 9 shows the nucleotide sequence (Seq ID No. 9) and corresponding amino acid sequence (Seq. ID No. 10) of the FvKC-II-KEX2 gene on the HindIII-EcoRI fragment in pUC19.

FIG. 10 shows the nucleotide sequence (Seq ID No. 13) and corresponding amino acid sequence (Seq. ID. No. 14) of the FvKC-II-KEX2 gene containing the 5' SnaBI restriction site on the HindIII-EcoRI fragment in pUC19.

FIG. 11 is a chromatogram obtained when the protein produced in transformed Pichia cells was recovered by affinity chromatography.

FIG. 12 shows SDS-PAGE analysis of the eluted protein.

FIG. 13 illustrates a DNA sequence used to produce a diabody;

FIG. 14 shows the polypeptide coded by the DNA sequence of FIG. 13;

FIG. 15 shows the diabody obtained from the polypeptide of FIG. 14.

In the embodiment of the invention illustrated by FIGS. 1 to 3, the procedure commences with a hybridoma which produces a monoclonal antibody of the desired specificity.

The gene (DNA sequence) which codes for the heavy chain variable domain is cloned from the genome of this hybridoma by means of techniques described in Orlandi, Winter et al PNAS USA 86 3833 (1989) and EP-A-368684.

m-RNA is recovered from hybridoma cells and used to produce cDNA by reverse transcription. The desired gene is then cloned from this cDNA by means of the polymerase chain reaction. Suitable primers are V_(H1) FOR2 and V_(H1) BACK, both disclosed in EP-A-368684.

The DNA produced in this way has a sequence corresponding to the V_(H1) FOR2 primer at one end (the 5' end of the anti-sense strand) and a sequence corresponding to the V_(H1) BACK primer at the 5' end of the sense strand. These sequences include Bst EII and Pst I recognition sites respectively.

The nucleotide sequence which codes for the light chain variable domain is cloned from the hybridoma genome in corresponding manner. When this is done, the primers used in the polymerase chain reaction are VK1 FOR and VK1 BACK as disclosed in EP-A-368684. The DNA which is produced in this cloning step has a sequence corresponding to the VK1 FOR primer at one end (the 5' end of the anti-sense strand). This includes a Bgl II recognition site. At the 5' end of the sense strand is a sequence coded by the VK1 BACK primer and including a Pvu II recognition site. This sequence is diagrammatically illustrated by FIG. 2.

A nucleotide sequence to code for a linking peptide is synthesised using standard techniques for oligonucleotide synthesis, for example using an automated synthesiser, chemical reagents and protocols supplied by Applied Biosystems (Warrington, UK). The sequence has a Bst EII recognition site close to its 5' end. It has a Pvu II recognition site close to its 3' end. Between these two sites is a sequence of codons to produce a short linking peptide. In the sense strand these comprise a sequence . . . CGA ATG GAT AAA AGG . . . (Seq. ID. No. 1) which codes for . . . Arg-Met-Asp-Lys-Arg . . . (Seq. ID. No. 2). These five amino acids provide a protease cleavage site, as will be explained below. This nucleotide sequence is illustrated in FIG. 3 of the accompanying drawings.
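The correspondence between the codons of Seq. ID. No. 1 and the peptide of Seq. ID. No. 2 can be checked with a short Python sketch. The codon table here is deliberately partial, an assumption made for brevity: it holds only the five codons that occur in the linker, taken from the standard genetic code.

```python
# Partial codon table: only the codons present in the linker (standard genetic code).
CODON_TABLE = {
    "CGA": "Arg",
    "ATG": "Met",
    "GAT": "Asp",
    "AAA": "Lys",
    "AGG": "Arg",
}


def translate(dna):
    """Translate an in-frame DNA string into a list of three-letter residues."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


linker_dna = "CGAATGGATAAAAGG"  # Seq. ID. No. 1
linker_peptide = translate(linker_dna)
print("-".join(linker_peptide))  # Arg-Met-Asp-Lys-Arg
```

The peptide ends in Lys-Arg, which is the pair recognised by KEX2-type proteases.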

The three nucleotide sequences described above (and illustrated in FIGS. 1, 2 and 3 of the accompanying diagrammatic drawings) are then assembled, in-frame, to form a longer sequence in which the sequence coding for the linking peptide extends between the 5' end of the heavy chain sequence and the 3' end of the light chain sequence, as illustrated by FIG. 4.

The construction of a genetic cassette (i.e. an in-frame nucleotide sequence) with the design described above can be carried out by standard techniques of molecular biology. For instance Clackson et al, Nature 352 p. 624 (1991), especially p. 625, describes a suitable method, although without any cleavage site as here provided in the linking peptide.

The resulting nucleotide sequence, illustrated as FIG. 4, is preferably amplified further by the polymerase chain reaction and then inserted into a vector and used to transform a host organism.

The transformed organism is cultured and expresses a polypeptide containing the variable domains of the heavy and light chains, with the C-terminal of the heavy chain coupled to the N-terminal of the light chain variable domain through the linking peptide. This polypeptide is illustrated in FIG. 5.

The peptide link between the V_(H) and V_(L) fragments, as illustrated in FIG. 5, is short. It does not allow these fragments freedom of movement sufficient that they can associate together.

The chosen host organism can be the filamentous fungus, Aspergillus. An enzyme which naturally occurs within this organism is the KEX2 protease. As mentioned above, this protease functions to cleave a peptide chain adjacent to arginine in an exposed

    . . . -Lys-Arg-X- . . . sequence

where X denotes any amino acid. KEX2-type enzymes are found in many different organisms. Once the polypeptide has been formed, it is cleaved by the KEX2 protease of the Aspergillus next to the -Lys-Arg- sequence of the linking peptide. Indeed, cleavage may occur after synthesis of one domain and the linking peptide, while synthesis of the second domain is in progress.

The light and heavy chain variable domains can now associate together as diagrammatically illustrated in FIG. 6 and can fold into their natural shape. At the C-terminal end of the heavy chain variable domain, the peptide chain is prolonged by a fragment of the linker peptide. The remainder of this peptide prolongs the light chain variable domain at its N-terminal end. These remnants are indicated as "R".

FIG. 7 illustrates, by way of example, the application of the invention to the production of an Fv (referred to as FvKC) which is specific for a peptide hormone. In this example, the producer organism is a methylotrophic yeast, Pichia pastoris, which is known to produce a KEX2-type protease.

1) Generation of DNA construct coding for an Fv comprising a KEX2-type processing site (Arg-Met-Asp-Lys-Arg) positioned between the VH chain and the VL chain

As shown in FIG. 7a, the DNA coding for an Fv (known as FvKC) with a specificity for a peptide hormone was assembled in an E. coli expression plasmid, pUC19, according to the method of Ward et al. Nature (1989) 341, 544. As in Ward's paper, the Fv was tagged at the C-terminus of its VL with a peptide tag (to facilitate assay of Fv activity). The peptide sequence (Seq. ID. No. 3) used in this example was the so-called hydrophil II tag (Gly-Ser-Gly-Ser-Gly-Asn-Ser-Gly-Lys-Gly-Tyr-Leu-Lys). This sequence was previously disclosed in Davis et al (1991) WO 91/08482. This starting DNA construct, shown in FIG. 7a, is designated FvKC-II. FIG. 8 shows the nucleotide sequence and corresponding amino acid sequence of the FvKC-II genes on the HindIII-EcoRI fragment in pUC19. The pelB leader and hydrophil-II tag sequences are shown in boxes. Relevant restriction sites are shown bold and underlined.

The FvKC-II DNA was amplified by growing it up in E. coli and recovering the plasmid DNA. The DNA between the V_(H) and V_(L) was removed by digesting with BstEII and SacI (as illustrated by FIG. 7b). This was then replaced (FIG. 7c) by a synthetic BstEII/SacI fragment encoding a KEX2-type cleavage site which was the Arg-Met-Asp-Lys-Arg sequence mentioned earlier. The synthetic fragment was composed of a pair of complementary oligonucleotides, Oligo 1 (Seq. ID No. 7) and Oligo 2 (Seq. ID No. 8) which were:

    5' GTCACCGTCTCCTCACGAATGGATAAAAGGGACATCGAGCT 3'            Oligo 1

    5' CGATGTCCCTTTTATCCATTCGTGAGGAGACG 3'                     Oligo 2
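As an illustrative check, the following Python sketch anneals Oligo 2 against Oligo 1 and reports the single-stranded ends left on Oligo 1. The helper names are hypothetical and the model assumes perfect, gap-free pairing. The overhangs it finds are consistent with the sticky ends expected for a BstEII/SacI fragment.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def sticky_ends(top, bottom):
    """Return (5' overhang, 3' overhang) of `top` once `bottom` is annealed.

    Assumes `bottom` pairs perfectly with one contiguous stretch of `top`.
    """
    paired = reverse_complement(bottom)
    start = top.find(paired)
    if start < 0:
        raise ValueError("strands are not complementary")
    return top[:start], top[start + len(paired):]


oligo1 = "GTCACCGTCTCCTCACGAATGGATAAAAGGGACATCGAGCT"  # Seq. ID No. 7
oligo2 = "CGATGTCCCTTTTATCCATTCGTGAGGAGACG"           # Seq. ID No. 8

five_prime, three_prime = sticky_ends(oligo1, oligo2)
print(five_prime, three_prime)  # GTCAC AGCT

# The linker codons of Seq. ID No. 1 lie inside the double-stranded region.
print("CGAATGGATAAAAGG" in reverse_complement(oligo2))  # True
```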

The completed DNA construct was designated FvKC-II-KEX2. FIG. 9 shows the nucleotide sequence and corresponding amino acid sequence of the FvKC-II-KEX2 gene on the HindIII-EcoRI fragment in pUC19. The pelB leader, KEX2 site and hydrophil-II tag sequences are boxed. Relevant restriction sites are bold and underlined.

2) Introduction of FvKC-II-KEX2 DNA construct into a Pichia expression vector

The DNA construct FvKC-II-KEX2 was inserted into an expression vector (pPIC9) for production in the methylotrophic yeast, Pichia pastoris. To achieve this, the 5' end of the DNA construct had to be modified so that it contained a restriction site that was compatible with the pPIC9 vector. (The restriction site SnaBI was chosen.) This modification was made by removing a HindIII/PstI fragment and replacing it with a synthetic HindIII/PstI fragment that contained a SnaBI site. (Refer to FIGS. 7d, e and f.) The synthetic fragment was composed of a pair of complementary oligonucleotides, Oligo 3 (Seq. ID No. 11) and Oligo 4 (Seq. ID No. 12) which were:

    5'AGCTTACGTACAGGTGCAGCTGCA 3'                              Oligo 3

    5'GCTGCACCTGTACGTA 3'                                      Oligo 4
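A short Python sketch can locate the SnaBI recognition sequence (TACGTA) in Oligo 3 and read the codons that follow it. The codon table is partial, an assumption made for brevity: it covers only the codons that occur in this stretch. The trailing bases that do not complete a codon are ignored.

```python
SNABI_SITE = "TACGTA"  # SnaBI recognition sequence

# Partial codon table (standard genetic code), limited to the codons present here.
CODON_TABLE = {"CAG": "Gln", "GTG": "Val", "CTG": "Leu"}

oligo3 = "AGCTTACGTACAGGTGCAGCTGCA"  # Seq. ID No. 11
oligo4 = "GCTGCACCTGTACGTA"          # Seq. ID No. 12

# Position of the SnaBI site in Oligo 3 (0-based index of its first base).
site_start = oligo3.find(SNABI_SITE)

# Read whole codons immediately downstream of the site.
downstream = oligo3[site_start + len(SNABI_SITE):]
codons = [downstream[i:i + 3] for i in range(0, len(downstream) - 2, 3)]
residues = [CODON_TABLE[codon] for codon in codons]

print(site_start)          # 4
print("-".join(residues))  # Gln-Val-Gln-Leu
# The site is its own reverse complement, so it also appears in Oligo 4.
print(SNABI_SITE in oligo4)  # True
```

The residues read downstream of the site, Gln-Val-Gln-Leu, match the first residues after the pelB leader in Seq ID No. 5.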

The completed DNA construct was designated Yeast-FvKC-II-KEX2. FIG. 10 shows the nucleotide sequence and corresponding amino acid sequence of the FvKC-II-KEX2 gene containing the 5' SnaBI restriction site on the HindIII-EcoRI fragment in pUC19. The KEX2 site and hydrophil-II tag sequences are in clear boxes; the SnaBI and EcoRI restriction sites are in grey boxes. Other relevant restriction sites are bold and underlined.

The Yeast-FvKC-II-KEX2 construct was amplified by growing in pUC19/E. coli. The construct was excised from plasmid pUC19 by digestion with SnaBI and EcoRI. (EcoRI is naturally present at the 3' end of the construct; refer to FIG. 7f.) The excised DNA was then ligated into the Pichia expression vector, pPIC9. (This vector is a component of the Pichia expression kit (version B), supplied by Invitrogen Corporation, San Diego, USA.)

3) Production and recovery of active FvKC-II from Pichia

The pPIC9 DNA with the insert of Yeast-FvKC-II-KEX2 was linearised by digesting with BglII. Then Pichia pastoris strain GS115 was transformed with this DNA according to the instruction manual supplied with the Pichia expression kit (version B).

500 ml of transformed Pichia culture was produced according to the instruction manual. This was then centrifuged and induced to express FvKC-II in a volume of 100 ml, again according to Invitrogen's instruction manual. After 48 hours of induction, cells were removed by centrifugation. 60 ml of the supernatant was loaded onto an affinity chromatography column comprising the peptide hormone (to which FvKC binds) immobilised on CNBr-activated SEPHAROSE 4B (Pharmacia). After loading the column, the adsorbent was washed with phosphate buffered saline (PBS) and then with 1 column volume of 1M sodium chloride (to eliminate non-specific binding). Bound (and therefore active) Fv was recovered by eluting with 50 mM glycine, pH 2.2. Elution from the column was detected by UV absorption. The chromatogram is shown at FIG. 11.

Recovered fractions were neutralised with Tris buffer and then dialysed into PBS. Two fractions were taken (refer to FIG. 11). Fraction 1 had a volume of 2 ml and contained 36 μg/ml protein. Fraction 2 had a volume of 4 ml and contained 78 μg/ml protein. Fractions 1 and 2 were analysed by SDS-PAGE. This was conducted with five lanes which were

1. Pharmacia low molecular weight markers;

2. Fraction 1;

3. Fraction 2;

4. Single chain Fv of the same hormone, expressed in E. coli;

5. A mixture of single chain Fv and the separate V_(H) and V_(L) chains, all expressed in E. coli.
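The amounts of protein in Fractions 1 and 2 follow directly from the volumes and concentrations reported above, and can be worked through in Python. Expressing the total relative to the 60 ml of supernatant loaded is an assumption made here for illustration; the text itself does not state a yield figure.

```python
# (volume in ml, protein concentration in micrograms per ml), as reported above.
fractions = {
    "Fraction 1": (2.0, 36.0),
    "Fraction 2": (4.0, 78.0),
}
SUPERNATANT_LOADED_ML = 60.0  # volume of culture supernatant applied to the column

# Mass of protein in each fraction, in micrograms.
masses_ug = {name: volume * conc for name, (volume, conc) in fractions.items()}
total_ug = sum(masses_ug.values())

# Assumed measure of recovery: eluted protein per ml of supernatant loaded.
recovered_ug_per_ml = total_ug / SUPERNATANT_LOADED_ML

for name, mass in masses_ug.items():
    print(f"{name}: {mass:.0f} micrograms")
print(f"Total: {total_ug:.0f} micrograms")
print(f"Recovered per ml of supernatant: {recovered_ug_per_ml:.1f} micrograms")
```

On these figures the two fractions hold 72 and 312 micrograms, 384 micrograms in total, or 6.4 micrograms per ml of supernatant loaded.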

The result of SDS-PAGE is reproduced as FIG. 12. It can be seen that for both lane 2 and lane 3, there were equal quantities of the two protein chains V_(H) and V_(L). There was negligible single chain Fv, nor were there fragments of lower molecular weight which would come from (unwanted) random cleavage of the peptide chains.

As neither of these chains can bind the peptide hormone on its own (they can only bind when associated in the form of an Fv), it is clear that the Pichia had synthesised VH-KEX2-VL-II; that this protein had been cleaved to yield equal amounts of VH and VL-II; and that these two chains had associated to produce active Fv before being applied to the affinity column. The Fv dissociated into separate chains when eluted from the column and was detected as separate chains by SDS-PAGE.

FIGS. 13 and 14 illustrate the utilisation of this invention in the production of a diabody as illustrated by FIG. 15. This is an artificial construction containing the variable domains of one antibody (V_(H1) and V_(L1)) and the variable domains (V_(H2) and V_(L2)) of a second antibody so that the artificial construct will display specific binding affinity for two different epitopes.

To make this construct, utilising this invention, nucleotide sequences coding for each of the heavy chain variable domains and each of the light chain variable domains are cloned in the manner described previously. These are assembled in the arrangement shown by FIG. 13.

As indicated in this figure, the overall nucleotide sequence includes at one end the nucleotide sequence which codes for the heavy chain variable domain of one antibody (V_(H1)) and the nucleotide sequence which codes for the light chain variable domain of the second antibody (V_(L2)). These nucleotide sequences are connected through a synthetic oligonucleotide sequence designated as link 1. The manner of assembling these nucleotide sequences can be generally as described previously but the link 1 sequence must not code for any protease cleavage site. Suitably it contains only glycine and serine.

The nucleotide sequence coding for the variable domain of the light chain of the first antibody (V_(L1)) is similarly connected through a synthetic nucleotide sequence, designated link 2, to the nucleotide sequence which codes for the variable domain of the heavy chain of the second antibody (V_(H2)).

The sequences coding for each of the light chain variable domains are connected through a linking sequence indicated as link 3 which codes for a peptide sequence which does contain a cleavage site. Link 3 is here exemplified as coding for . . . Gly-Lys-Arg . . . but it may code for some other peptide link containing a suitable cleavage site. When all of these nucleotide sequences have been assembled, the resulting sequence is incorporated into a host organism which is cultured to express the polypeptide shown by FIG. 14. This is then brought into contact with a protease which functions to sever the link 3 amino acid sequence but not link 1 or link 2. The resulting two polypeptides are now able to associate together to give the diabody illustrated by FIG. 15. Remnants "R" of the central link 3 peptide extend from the N-terminal of V_(L1) and the C-terminal of V_(L2).
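The domain arrangement and the selective cleavage of link 3 can be modelled with a small Python sketch. It works on segment labels rather than real sequences, and the labels and the "R" remnant marker are illustrative only.

```python
# Order of segments in the single expressed polypeptide, N-terminus first.
construct = ["VH1", "link1", "VL2", "link3", "VL1", "link2", "VH2"]

# Only link 3 carries a protease cleavage site; links 1 and 2 do not.
CLEAVABLE = {"link3"}


def cleave(segments, cleavable):
    """Split the construct at each cleavable link, leaving remnants 'R'."""
    chains, current = [], []
    for segment in segments:
        if segment in cleavable:
            current.append("R")  # remnant on the C-terminus of this chain
            chains.append(current)
            current = ["R"]      # remnant on the N-terminus of the next chain
        else:
            current.append(segment)
    chains.append(current)
    return chains


chains = cleave(construct, CLEAVABLE)
for chain in chains:
    print("-".join(chain))
```

The two resulting chains, VH1-link1-VL2-R and R-VL1-link2-VH2, each pair a heavy chain domain of one antibody with a light chain domain of the other, so that the chains can associate with each other to form the two binding sites.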

    __________________________________________________________________________

    SEQUENCE LISTING

    (1) GENERAL INFORMATION:
    (iii) NUMBER OF SEQUENCES: 14

    (2) INFORMATION FOR SEQ ID NO:1:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

    CGAATGGATAAAAGG  15

    (2) INFORMATION FOR SEQ ID NO:2:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 5 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (iii) HYPOTHETICAL:
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

    Arg Met Asp Lys Arg
    1               5

    (2) INFORMATION FOR SEQ ID NO:3:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 13 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

    Gly Ser Gly Ser Gly Asn Ser Gly Lys Gly Tyr Leu Lys
    1               5                   10

    (2) INFORMATION FOR SEQ ID NO:4:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 996 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: circular
    (ii) MOLECULE TYPE: DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

    AAGCTTGCATGCAAATTCTATTTCAAGGAGACAGTCATAATGAAATACCTATTGCCTACG  60
    GCAGCCGCTGGATTGTTATTACTCGCTGCCCAACCGGCCATGGCCCAGGTGCAGCTGCAG  120
    CAGTCTGGGGCTGAACTGGTGAAGCCTGGGCCTTCTGTGAAGCTGTCCTGCAAGGCTTCC  180
    GACTACACCTTCACCAGTTATTGGATGCACTGGGTGAAGCAGAGGCCTGGACAAGGCCTT  240
    GAGTGGATTGGAGAGATTAATCCTACCAACGGTCGTACTTATTACAATGAGAAGTTCAAG  300
    AGCAAGGCAACACTGACTGTAGACAAATCTTCCAGTACAGCCTACATGCAGCTCAGCAGC  360
    CTGACATCTGAGGACTCTGCGGTCTATTACTGTGCAAGACGGTATGGTAACTCCTTTGAC  420
    TACTGGGGCCAAGGGACCACGGTCACCGTCTCCTCATAATAAGAGCTATGGGAGCTTGCA  480
    TGCAAATTCTATTTCAAGGAGACAGTCATAATGAAATACCTATTGCCTACGGCAGCCGCT  540
    GGATTGTTATTACTCGCTGCCCAACCAGCGATGGCCGACATCGAGCTCACCCAGTCTCCA  600
    GATTCTTTGGCTGTGTCTCTAGGGCAGAGGGCCACCATATCCTGCAGAGCCAGTGAAAGT  660
    GTTGATAGTTATGGCAATAGTTTTATGCAGTGGTACCAGCAGAAACCAGGACAGCCACCC  720
    AAACTCCTCATCTATCGTGCATCCAACCTAGAATCTGGGATTCCTGCCAGGTTCAGTGGC  780
    ACTGGGTCTAGGACAGACTTCACCCTCACCATTAATCCTGTGGAGGCTGATGATGTTGCA  840
    ACCTATTATTGTCAACAAAGTGATGAGTATCCGTACATGTACACGTTCGGAGGGGGGACC  900
    AAGCTCGAGATCAAACGGGGATCCGGTAGCGGGAACTCCGGTAAGGGGTACCTGAAGTAA  960
    TAAGATCAAACGGTAATAAGGATCCAGCTCGAATTC  996

    (2) INFORMATION FOR SEQ ID NO:5:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 139 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

    Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
    1               5                   10                  15
    Ala Gln Pro Ala Met Ala Gln Val Gln Leu Gln Gln Ser Gly Ala Glu
                20                  25                  30
    Leu Val Lys Pro Gly Pro Ser Val Lys Leu Ser Cys Lys Ala Ser Asp
            35                  40                  45
    Tyr Thr Phe Thr Ser Tyr Trp Met His Trp Val Lys Gln Arg Pro Gly
        50                  55                  60
    Gln Gly Leu Glu Trp Ile Gly Glu Ile Asn Pro Thr Asn Gly Arg Thr
    65                  70                  75                  80
    Tyr Tyr Asn Glu Lys Phe Lys Ser Lys Ala Thr Leu Thr Val Asp Lys
                    85                  90                  95
    Ser Ser Ser Thr Ala Tyr Met Gln Leu Ser Ser Leu Thr Ser Glu Asp
                100                 105                 110
    Ser Ala Val Tyr Tyr Cys Ala Arg Arg Tyr Gly Asn Ser Phe Asp Tyr
            115                 120                 125
    Trp Gly Gln Gly Thr Thr Val Thr Val Ser Ser
        130                 135

    (2) INFORMATION FOR SEQ ID NO:6:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 149 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

    Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
    1               5                   10                  15
    Ala Gln Pro Ala Met Ala Asp Ile Glu Leu Thr Gln Ser Pro Asp Ser
                20                  25                  30
    Leu Ala Val Ser Leu Gly Gln Arg Ala Thr Ile Ser Cys Arg Ala Ser
            35                  40                  45
    Glu Ser Val Asp Ser Tyr Gly Asn Ser Phe Met Gln Trp Tyr Gln Gln
        50                  55                  60
    Lys Pro Gly Gln Pro Pro Lys Leu Leu Ile Tyr Arg Ala Ser Asn Leu
    65                  70                  75                  80
    Glu Ser Gly Ile Pro Ala Arg Phe Ser Gly Thr Gly Ser Arg Thr Asp
                    85                  90                  95
    Phe Thr Leu Thr Ile Asn Pro Val Glu Ala Asp Asp Val Ala Thr Tyr
                100                 105                 110
    Tyr Cys Gln Gln Ser Asp Glu Tyr Pro Tyr Met Tyr Thr Phe Gly Gly
            115                 120                 125
    Gly Thr Lys Leu Glu Ile Lys Arg Gly Ser Gly Ser Gly Asn Ser Gly
        130                 135                 140
    Lys Gly Tyr Leu Lys
    145

    (2) INFORMATION FOR SEQ ID NO:7:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 41 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

    GTCACCGTCTCCTCACGAATGGATAAAAGGGACATCGAGCT  41

    (2) INFORMATION FOR SEQ ID NO:8:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 32 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

    CGATGTCCCTTTTATCCATTCGTGAGGAGACG  32

    (2) INFORMATION FOR SEQ ID NO:9:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 891 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: circular
    (ii) MOLECULE TYPE: DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

    AAGCTTGCATGCAAATTCTATTTCAAGGAGACAGTCATAATGAAATACCTATTGCCTAGG  60
    GCAGCCGCTGGATTGTTATTACTCGCTGCCCAACCGGCCATGGCCCAGGTGCAGCTGCAG  120
    CAGTCTGGGGCTGAACTGGTGAAGCCTGGGCCTTCTGTGAAGCTGTCCTGCAAGGCTTCC  180
    GACTACACCTTCACCAGTTATTGGATGCACTGGGTGAAGCAGAGGCCTGGACAAGGCCTT  240
    GAGTGGATTGGAGAGATTAATCCTACCAACGGTCGTACTTATTACAATGAGAAGTTCAAG  300
    AGCAAGGCCACACTGACTGTAGACAAATCTTCCAGTACAGCCTACATGCAGCTCAGCAGC  360
    CTGACATCTGAGGACTCTGCGGTCTATTACTGTGCAAGACGGTATGGTAACTCCTTTGAC  420
    TACTGGGGCCAAGGGACCACGGTCACCGTCTCCTCACGAATGGATAAAAGGGACATCGAG  480
    CTCACCCAGTCTCCAGATTCTTTGGCTGTGTCTCTAGGGCAGAGGGCCACCATATCCTGC  540
    AGAGCCAGTGAAAGTGTTGATAGTTATGGCAATAGTTTTATGCAGTGGTACCAGCAGAAA  600
    CCAGGACAGCCACCCAAACTCCTCATCTATCGTGCATCCAACCTAGAATCTGGGATTCTT  660
    GCCAGGTTCAGTGGCACTGGGTCTAGGACAGACTTCACCCTCACCATTAATCCTGTGGAG  720
    GCTGATGATGTTGCAACCTATTATTGTCAACAAAGTGATGAGTATCCGTACATGTACACG  780
    TTCGGAGGGGGGACCAAGCTCGAGATCAAACGGGGATCCGGTAGCGGGAACTCCGGTAAG  840
    GGGTACCTGAAGTAATAAGATCAAACGGTAATAAGGATCCAGCTCGAATTC  891

    (2) INFORMATION FOR SEQ ID NO:10:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 271 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE
DESCRIPTION: SEQ ID NO:10:    MetLysTyrLeuLeuProThrAlaAlaAlaGlyLeuLeuLeuLeuAla    151015    AlaGlnProAlaMetAlaGlnValGlnLeuGlnGlnSerGlyAlaGlu    202530    LeuValLysProGlyProSerValLysLeuSerCysLysAlaSerAsp    354045    TyrThrPheThrSerTyrTrpMetHisTrpValLysGlnArgProGly    505560    GlnGlyLeuGluTrpIleGlyGluIleAsnProThrAsnGlyArgThr    65707580    TyrTyrAsnGluLysPheLysSerLysAlaThrLeuThrValAspLys    859095    SerSerSerThrAlaTyrMetGlnLeuSerSerLeuThrSerGluAsp    100105110    SerAlaValTyrTyrCysAlaArgArgTyrGlyAsnSerPheAspTyr    115120125    TrpGlyGlnGlyThrThrValThrValSerSerArgMetAspLysArg    130135140    AspIleGluLeuThrGlnSerProAspSerLeuAlaValSerLeuGly    145150155160    GlnArgAlaThrIleSerCysArgAlaSerGluSerValAspSerTyr    165170175    GlyAsnSerPheMetGlnTrpTyrGlnGlnLysProGlyGlnProPro    180185190    LysLeuLeuIleTyrArgAlaSerAsnLeuGluSerGlyIleProAla    195200205    ArgPheSerGlythrGlySerArgThrAspPheThrLeuThrIleAsn    210215220    ProValGluAlaAspAspValAlaThrTyrTyrCysGlnGlnSerAsp    225230235240    GluTyrProTyrMetTyrThrPheGlyGlyGlyThrLysLeuGluIle    245250255    LysArgGlySerGlySerGlyAsnSerGlyLysGlyTyrLeuLys    260265270    (2) INFORMATION FOR SEQ ID NO:11:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 24 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: DNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:    AGCTTACGTACAGGTGCAGCTGCA24    (2) INFORMATION FOR SEQ ID NO:12:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 16 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: DNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:    GCTGCACCTGTACGTA16    (2) INFORMATION FOR SEQ ID NO:13:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 797 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: double    (D) TOPOLOGY: circular    (ii) MOLECULE TYPE: DNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:    AAGCTTACGTACAGGTGCAGCTGCAGCAGTCTGGGGCTGAACTGGTGAAGCCTGGGCCTT60    
CTGTGAAGCTGTCCTGCAAGGCTTCCGACTACACCTTCACCAGTTATTGGATGCACTGGG120    TGAAGCAGAGGCCTGGACAAGGCCTTGAGTGGATTGGAGAGATTAATCCTACCAACGGTC180    GTACTTATTACAATGAGAAGTTCAAGAGCAAGGCCACACTGACTGTAGACAAATCTTCCA240    GTACAGCCTACATGCAGCTCAGCAGCCTGACATCTGAGGACTCTGCGGTCTATTACTGTG300    CAAGACGGTATGGTAACTCCTTTGACTACTGGGGCCAAGGGACCACGGTCACCGTCTCCT360    CACGAATGGATAAAAGGGACATCGAGCTCACCCAGTCTCCAGATTCTTTGGCTGTGTCTC420    TAGGGCAGAGGGCCACCATATCCTGCAGAGCCAGTGAAAGTGTTGATAGTTATGGCAATA480    GTTTTATGCAGTGGTACCAGCAGAAACCAGGACAGCCACCCAAACTCCTCATCTATCGTG540    CATCCAACCTAGAATCTGGGATTCCTGCCAGGTTCAGTGGCACTGGGTCTAGGACAGACT600    TCACCCTCACCATTAATCCTGTGGAGGCTGATGATGTTGCAACCTATTATTGTCAACAAA660    GTGATGAGTATCCGTACATGTACACGTTCGGAGGGGGGACCAAGCTCGAGATCAAACGGG720    GATCCGGTAGCGGGAACTCCGGTAAGGGGTACCTGAAGTAATAAGATCAAACGGTAATAA780    GGATCCAGCTCGAATTC797    (2) INFORMATION FOR SEQ ID NO:14:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 252 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: protein    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:    AlaTyrValGlnValGlnLeuGlnGlnSerGlyAlaGluLeuValLys    151015    ProGlyProSerValLysLeuSerCysLysAlaSerAspTyrThrPhe    202530    ThrSerTyrTrpMetHisTrpValLysGlnArgProGlyGlnGlyLeu    354045    GluTrpIleGlyGluIleAsnProThrAsnGlyArgThrTyrTyrAsn    505560    GluLysPheLysSerLysAlaThrLeuThrValAspLysSerSerSer    65707580    ThrAlaTyrMetGlnLeuSerSerLeuThrSerGluAspSerAlaVal    859095    TyrTyrCysAlaArgArgTyrGlyAsnSerPheAspTyrTrpGlyGln    100105110    GlyThrThrValThrValSerSerArgMetAspLysArgAspIleGlu    115120125    LeuThrGlnSerProAspSerLeuAlaValSerLeuGlyGlnArgAla    130135140    ThrIleSerCysArgAlaSerGluSerValAspSerTyrGlyAsnSer    145150155160    PheMetGlnTrpTyrGlnGlnLysProGlyGlnProProLysLeuLeu    165170175    IleTyrArgAlaSerAsnLeuGluSerGlyIleProAlaArgPheSer    180185190    GlyThrGlySerArgThrAspPheThrLeuThrIleAsnProValGlu    195200205    AlaAspAspValAlaThrTyrTyrCysGlnGlnSerAspGluTyrPro    210215220    TyrMetTyrThrPheGlyGlyGlyThrLysLeuGluIleLysArgGly   
 225230235240    SerGlySerGlyAsnSerGlyLysGlyTyrLeuLys    245250    __________________________________________________________________________
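
As a reading aid for the listing above, the following minimal Python sketch checks that the 15-base linker sequence of SEQ ID NO:1 translates to the peptide of SEQ ID NO:2, and that it occurs within the oligonucleotide of SEQ ID NO:7. The codon table is deliberately partial (only the five codons needed, taken from the standard genetic code), and the helper name `translate` is an illustrative assumption.

```python
# Partial codon table (standard genetic code): only the codons that
# occur in SEQ ID NO:1. This is an illustrative subset, not a full table.
CODONS = {
    "CGA": "Arg",
    "ATG": "Met",
    "GAT": "Asp",
    "AAA": "Lys",
    "AGG": "Arg",
}


def translate(dna: str) -> str:
    """Translate a DNA string, read in frame from its first base, into
    concatenated three-letter amino acid codes."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


seq_id_1 = "CGAATGGATAAAAGG"                            # SEQ ID NO:1
seq_id_7 = "GTCACCGTCTCCTCACGAATGGATAAAAGGGACATCGAGCT"  # SEQ ID NO:7

peptide = translate(seq_id_1)
linker_in_oligo = seq_id_1 in seq_id_7

print(peptide)          # ArgMetAspLysArg, as given for SEQ ID NO:2
print(linker_in_oligo)  # True
```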

We claim:
 1. A method of preparing a proteinaceous product which incorporates associative portions of antibody light and heavy chains, by: connecting nucleotide sequences which code for the portions of the two chains, by means of an additional nucleotide sequence which is interposed between them; transforming a host organism to incorporate the connected nucleotide sequences; culturing the transformed organism to express a polypeptide which contains the portions of the light and heavy chains, joined by a linking peptide sequence coded by the said additional nucleotide sequence; characterised in that a cleavage site in the linking peptide sequence is such that this linking peptide is cut enzymatically by an enzyme produced by the transformed organism followed by recovery of the proteinaceous product, wherein the linking peptide which joins the portions of the light and heavy chains is too short to allow the portions to associate with each other before the linking peptide is cut.
 2. A method according to claim 1 wherein the genetic material of the host organism is transformed to express the protease as well as being transformed to express the said polypeptide.
 3. A method according to claim 1 wherein the associative portions of the two chains include at least the antigen binding regions of the variable domains of the light and heavy chains.
 4. A method according to claim 1, wherein the associative portions are the variable domains of the light and heavy chains of a single antibody so that the product of the method is an Fv antibody fragment.
 5. A method according to claim 1, wherein the associative portions are the variable domains of the light and heavy chains of two antibodies so that the product of the method is a diabody incorporating two Fv antibody fragments.
 6. A method according to claim 1 incorporating a purification step carried out by binding a support to a sequence of amino acids of the linking polypeptide.
 7. A method according to claim 1 accompanied by an assay step carried out by binding a support to a sequence of amino acids of the linking polypeptide.
 8. A method according to claim 1, wherein the linking peptide contains more than one cleavage site and exposure of the polypeptide to the said enzyme entirely detaches the part of the link between the two cleavage sites.
 9. A method according to claim 8 wherein the protease is of the KEX2 type.
 10. A method according to claim 8, wherein the enzyme to cut the linking peptide sequence is a protease.
 11. A method according to claim 10 wherein the protease cuts the linking peptide between arginine and an adjacent amino acid X in a sequence

    . . . lysine-arginine-X . . .

where X denotes any amino acid.
 12. A method according to claim 1 wherein the transformed organism is a yeast.
 13. A method according to claim 12 wherein the transformed organism is a methylotrophic yeast.